On the Relationship Between Binary Classification, Bipartite Ranking, and Binary Class Probability Estimation

Authors

Harikrishna Narasimhan

Shivani Agarwal

Abstract

We investigate the relationship between three fundamental problems in machine learning: binary classification, bipartite ranking, and binary class probability estimation (CPE). It is known that a good binary CPE model can be used to obtain a good binary classification model (by thresholding at 0.5), and also to obtain a good bipartite ranking model (by using the CPE model directly as a ranking model); it is also known that a binary classification model does not necessarily yield a CPE model. However, not much is known about other directions. Formally, these relationships involve regret transfer bounds. In this paper, we introduce the notion of weak regret transfer bounds, where the mapping needed to transform a model from one problem to another depends on the underlying probability distribution (and in practice, must be estimated from data). We then show that, in this weaker sense, a good bipartite ranking model can be used to construct a good classification model (by thresholding at a suitable point), and more surprisingly, also to construct a good binary CPE model (by calibrating the scores of the ranking model).
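The four model conversions named in the abstract can be illustrated with a small, standard-library-only Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names are hypothetical, the "suitable point" is chosen by minimizing empirical 0-1 error on a labeled sample, and the calibration step uses isotonic regression (pool-adjacent-violators), which is one common way to calibrate scores and may differ from the procedure the authors analyze. Scores are assumed to be distinct.

```python
from typing import List, Sequence, Tuple


def classify_from_cpe(probs: Sequence[float], threshold: float = 0.5) -> List[int]:
    """CPE -> classifier: predict 1 when the estimated P(y=1|x) exceeds the threshold."""
    return [1 if p > threshold else 0 for p in probs]


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """CPE or ranker -> ranking quality: fraction of (positive, negative)
    pairs ordered correctly, with ties counted as 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    total = 0.0
    for p in pos:
        for n in neg:
            total += 1.0 if p > n else 0.5 if p == n else 0.0
    return total / (len(pos) * len(neg))


def best_threshold(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Ranker -> classifier (illustrative): pick the cut-off with the lowest
    empirical 0-1 error, predicting 1 when score > cut-off."""
    # Candidate cut-offs: one below every score (predict all positive),
    # then each observed score in increasing order.
    candidates = [min(scores) - 1.0] + sorted(set(scores))

    def errors(t: float) -> int:
        return sum((1 if s > t else 0) != y for s, y in zip(scores, labels))

    return min(candidates, key=errors)


def isotonic_calibrate(
    scores: Sequence[float], labels: Sequence[int]
) -> List[Tuple[float, float]]:
    """Ranker -> CPE (illustrative): pool-adjacent-violators fit of the labels
    against the scores. Returns (score, calibrated probability) pairs in
    increasing score order. Assumes distinct scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    blocks: List[List[float]] = []  # each block holds [sum of labels, count]
    for i in order:
        blocks.append([float(labels[i]), 1.0])
        # Merge backwards while block means violate monotonicity.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            label_sum, count = blocks.pop()
            blocks[-1][0] += label_sum
            blocks[-1][1] += count
    fitted: List[float] = []
    for label_sum, count in blocks:
        fitted.extend([label_sum / count] * int(count))
    return [(scores[i], f) for i, f in zip(order, fitted)]
```

Both `best_threshold` and `isotonic_calibrate` need labeled data, which mirrors the abstract's point that in the weak sense the transforming map depends on the underlying distribution and must be estimated from data.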

Similar resources

Bayes-Optimal Scorers for Bipartite Ranking

We address the following seemingly simple question: what is the Bayes-optimal scorer for a bipartite ranking risk? The answer to this question helps elucidate the relationship between bipartite ranking and other established learning problems. We show that the answer is non-trivial in general, but may be easily determined for certain special cases using the theory of proper losses. Our analysis ...

Feature-based Malicious URL and Attack Type Detection Using Multi-class Classification

Nowadays, malicious URLs are a common threat to businesses, social networks, net-banking, etc. Existing approaches have focused on binary detection, i.e., whether the URL is malicious or benign. Very little literature addresses the detection of malicious URLs together with their attack types. Hence, it becomes necessary to know the attack type and adopt an effective countermeasure. This pa...

Active Sampling of Pairs and Points for Large-scale Linear Bipartite Ranking

Bipartite ranking is a fundamental ranking problem that learns to order relevant instances ahead of irrelevant ones. One major approach for bipartite ranking, called the pair-wise approach, tackles an equivalent binary classification problem of whether one instance out of a pair of instances should be ranked higher than the other. Nevertheless, the number of instance pairs constructed from the ...

Bipartite ranking: risk, optimality, and equivalences

We present a systematic study of the bipartite ranking problem, with the aim of delineating its connections to the class-probability estimation problem. Our study focuses on the properties of the statistical risk for bipartite ranking, which is closely related to the area under the ROC curve: we establish alternate representations of the risk, relate the Bayes-optimal risk to a class of probabi...

On classification, ranking, and probability estimation

Given a binary classification task, a ranker is an algorithm that can sort a set of instances from highest to lowest expectation that the instance is positive. In contrast to a classifier, a ranker does not output class predictions, although it can be turned into a classifier with the help of an additional procedure that splits the ranked list in two. A straightforward way to compute rankings is to...

Publication date: 2013
